Abstract. This paper discusses the evolution of probability distributions for certain time-dependent dynamical
systems. Exponential loss of memory is proved for expanding maps and for one-dimensional piecewise
expanding maps with slowly varying parameters.



1. Introduction 



This paper is about statistical properties of nonautonomous dynamical systems, such as flows defined
by time-dependent vector fields or their discrete-time counterparts described by compositions of the form
$f_n \circ \cdots \circ f_2 \circ f_1$ where all the $f_i : X \to X$ are self-maps of a space $X$. The topic to be discussed is the degree
to which such a system retains its memory of the past as it evolves with time.

Memory is lost when the initial state of a system is quickly forgotten. Conceptually, this can happen 
in two very different ways. The first is for trajectories to merge, so that in time, they evolve effectively 
as a single trajectory independent of their points of origin. This happens in systems that are contractive. 
Consider for example a system defined by the composition of a sequence of maps $f_i$ of a compact metric
space $X$ to itself, and assume that all the $f_i$ have a uniform Lipschitz constant $L < 1$, i.e., for all $x, y \in X$,
$d(f_i x, f_i y) \le L\, d(x, y)$. Since the diameter of the image of $X$ decreases exponentially with time, all trajectories
eventually coalesce into an exponentially small blob, which in general continues to evolve with time (except
when all the $f_i$ have the same fixed point). A similar phenomenon is known to occur in random dynamical
systems. An SDE of the form



gives rise to a stochastic flow of diffeomorphisms, in which almost every Brownian path defines a time-dependent
flow (see e.g. [10]). When all of the Lyapunov exponents are strictly negative, trajectories are
known to coalesce into random sinks (see [3, 13]). This phenomenon occurs naturally in applications, such as 
the Navier-Stokes system with sufficiently large viscosity (see e.g. [17, 18]), and in certain neural oscillator 
networks (see e.g. [14]). 
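
The contractive mechanism described above can be illustrated with a minimal numerical sketch. The affine maps $f_i(x) = c_i + L(x - c_i)$ on $[0,1]$, the random choice of the fixed points $c_i$, and the function name `compose_contractions` are assumptions made only for this illustration; they are not taken from the results of this paper.

```python
import random

def compose_contractions(points, n_steps, lip=0.5, seed=0):
    """Apply n_steps affine contractions f_i(x) = c_i + lip * (x - c_i) on [0, 1].

    Each f_i has Lipschitz constant `lip` < 1 and its own fixed point c_i,
    so the composition is time-dependent. Returns the images of `points`.
    """
    rng = random.Random(seed)
    for _ in range(n_steps):
        c = rng.random()  # fixed point of this step's map (hypothetical choice)
        points = [c + lip * (x - c) for x in points]
    return points

initial = [0.0, 0.25, 0.5, 0.75, 1.0]
final = compose_contractions(initial, n_steps=20)
# Diameter of the image of the initial set after 20 steps.
spread = max(final) - min(final)
```

The spread shrinks by the factor `lip` at every step regardless of the $c_i$, while the location of the blob keeps moving because the fixed points differ from step to step.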

In chaotic systems (autonomous or not), memory is lost quickly not through the coalescing of trajectories 
but for a diametrically opposite reason, namely their sensitive dependence on initial conditions. Small errors 
multiply quickly with time, so that in practice it is virtually impossible to track a specific trajectory in a 
chaotic system. For this reason, a statistical approach is often taken. Let $\rho_0$ denote an initial probability
density with respect to a reference measure $m$, and suppose its time evolution is given by $\rho_t$. As with
individual trajectories, one may ask if these probability distributions retain memories of their pasts. We will
say a system loses its memory in the statistical sense if for two initial distributions $\rho_0$ and $\hat\rho_0$, $\int |\rho_t - \hat\rho_t|\, dm \to 0$
as $t \to \infty$. It is this form of memory loss that is studied in the present paper. Of particular interest is when
memory is lost quickly: we say a system has exponential statistical loss of memory if there is a number $\alpha > 0$
such that for any $\rho_0$ and $\hat\rho_0$, $\int |\rho_t - \hat\rho_t|\, dm \le C e^{-\alpha t}$. Such memory loss may happen over a finite time
interval, i.e., for $t \le T$, or for all $t \ge 0$.
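
To make the definition of statistical memory loss concrete, the following sketch evolves two initial densities under a time-dependent composition and records the $L^1$ distance between them. The two-branch piecewise linear maps $f_p(y) = y/p$ on $[0,p)$, $f_p(y) = (y-p)/(1-p)$ on $[p,1)$, the parameter sequence `params`, and the grid discretization are all assumptions chosen for illustration. For these maps and linear initial densities, one can check by hand that each step multiplies the difference of the densities by $p^2 + (1-p)^2 < 1$.

```python
import numpy as np

def push_forward(phi, grid, p):
    """Transfer operator of the map f(y) = y/p on [0, p), (y - p)/(1 - p) on [p, 1),
    applied to a density sampled on `grid` (linear interpolation between nodes)."""
    left = np.interp(p * grid, grid, phi)                # inverse branch y = p*x
    right = np.interp(p + (1.0 - p) * grid, grid, phi)   # inverse branch y = p + (1-p)*x
    return p * left + (1.0 - p) * right

def l1_distance(phi, psi, grid):
    """Trapezoid-rule approximation of the integral of |phi - psi| over [0, 1]."""
    diff = np.abs(phi - psi)
    return float(np.sum(0.5 * (diff[1:] + diff[:-1]) * np.diff(grid)))

grid = np.linspace(0.0, 1.0, 1001)
phi = 0.5 + grid   # first initial density
psi = 1.5 - grid   # second initial density
params = [0.3, 0.45, 0.6, 0.5, 0.35, 0.55, 0.4, 0.65]  # hypothetical time-dependent p_i

distances = [l1_distance(phi, psi, grid)]
for p in params:
    phi, psi = push_forward(phi, grid, p), push_forward(psi, grid, p)
    distances.append(l1_distance(phi, psi, grid))
```

The list `distances` then decreases geometrically even though no two consecutive maps coincide, which is the behavior the definition above is meant to capture.
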

Observe that while the two forms of memory loss described above are quite different on the phenomenological
level, the latter can be seen mathematically as a manifestation of the first: By viewing $\{\rho_t\}_{t \ge 0}$ as a
trajectory in the space of probability densities, statistical loss of memory is equivalent to $\rho_t$ and $\hat\rho_t$ having
a common future. The results of this paper are based on this point of view.

Before proceeding to specific results, we first describe a model that we think is very useful to keep in 
mind, even though the analysis of this model is somewhat beyond the scope of the present work. 

Example 1.1. Lorentz gas with slowly moving scatterers. The 2-dimensional periodic Lorentz gas
is usually modeled by the uniform motion of a particle in a domain $X = \mathbb{T}^2 \setminus \bigcup_i \Gamma_i$ where the $\Gamma_i$ are
pairwise disjoint convex subsets of $\mathbb{T}^2$ and the particle bounces off the "walls" of this domain (equivalently
the boundaries of the scatterers) according to the rule that the angle of incidence is equal to the angle of 
reflection. In this model, the scatterers represent very heavy particles or ions, which move so slowly relative 
to the light particle (the one whose motion is described by the billiard flow) that one generally assumes 
they are fixed. This is the traditional setup in billiard studies. In reality, however, these large particles are 
bombarded by many light particles, and we focus on only one tagged light particle. The bombardments do 
cause the large particles to move about, though very slowly, and effectively independently of the motion of 
the tagged particle. Thus one can argue that it is more realistic to model the situation as a billiard flow in 
a slowly varying environment, i.e., where the positions of the scatterers change very slowly with time. (See 
the recent work [8], which attempts to model the motion of a single heavy particle.) 

In this paper, we prove exponential loss of memory in the statistical sense discussed above for time-dependent
systems defined by expanding and piecewise expanding maps, the latter in one dimension only.
Expanding maps (time-dependent or not) provide the simplest paradigms for exponential loss of memory in 
the statistical sense; we use them to illustrate our ideas on the most basic level as their analysis requires 
few technical considerations. Piecewise expanding maps, on the other hand, begin to exhibit some of the 
characteristics of the time-dependent billiard maps in the guiding example above. Our results can therefore 
be seen as a first step toward this physically relevant system. 

The results of this paper apply to finite as well as infinite time, and our setting extends not only that 
of iterations of single maps (for which results on correlation decay for expanding maps and 1D piecewise
expanding maps are not new); it also includes skew products in which fiber dynamics are of these types
as well as random compositions. What is different and new here is that the stationarity of the process is 
entirely irrelevant. Nor do the constituent maps have to belong to a bounded family, in which case the rates 
of memory loss may vary accordingly. A study which is closest to ours in spirit is [12]. 

Coupling methods are used in this paper, although we could have used spectral arguments, the Hilbert 
metric, or other techniques (see e.g. [6, 7, 15, 19, 20, 22, 23]). We do not claim that our methods are 
novel. On the contrary, one of the points of this paper is that under suitable conditions, existing methods 
for autonomous systems can be adapted to give results for this considerably broader class of dynamical 
settings, and we identify some of these conditions. Finally, even though coupling arguments have been used 
in more sophisticated settings, see e.g. [4, 5, 7, 23], we were unable to locate a coupling-based proof for single 
expanding maps. Section 2 will include this as a special case. 

Notation. The following notation is used throughout: given $f_i : X \to X$ for $i \in \mathbb{N}$,

(1) for $n \ge m$, we write $F_{n,m} = f_n \circ \cdots \circ f_m$;

(2) for $n \ge 1$, we write $F_n = F_{n,1}$.

2. Time-dependent expanding maps 

2.1. Results. Let $M$ be a compact, connected Riemannian manifold without boundary. A smooth map
$f : M \to M$ is called expanding if there exists $\lambda > 1$ such that

$$|Df(x)v| \ge \lambda |v|$$

for every $x \in M$ and every tangent vector $v$ at $x$. Expanding maps provide the simplest examples of systems
with exponential loss of statistical memory.

First we introduce some frequently-used notation. If $\nu$ is a Borel probability measure on $M$, then we let
$f_*\nu$ denote the measure obtained by transporting $\nu$ forward using $f$, i.e., $f_*\nu(E) = \nu(f^{-1}E)$ for all Borel
sets $E$. If $d\nu = \varphi\, dm$ where $m$ is the Riemannian measure on $M$, then the density of $f_*\nu$ is given by $\mathcal{P}_f(\varphi)$
where

$$\mathcal{P}_f(\varphi)(x) = \sum_{y \in f^{-1}x} \frac{\varphi(y)}{|\det Df(y)|}.$$

Here $\mathcal{P}_f$ is the transfer operator associated with the map $f$; $\mathcal{P}_{F_n}$ is defined similarly.
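
As a minimal sketch of this formula in one dimension, the transfer operator can be evaluated by summing over inverse branches. The helper `transfer_operator`, its argument convention (a list of pairs consisting of an inverse branch and the absolute derivative), and the doubling map used as the example are assumptions introduced for illustration only.

```python
import math

def transfer_operator(phi, inverse_branches):
    """Return P_f(phi) as a callable.

    `inverse_branches` is a list of (g, jac) pairs, where g is an inverse
    branch of f and jac(y) = |f'(y)| (the 1D analogue of |det Df(y)|).
    """
    def pushed(x):
        total = 0.0
        for g, jac in inverse_branches:
            y = g(x)                  # preimage of x along this branch
            total += phi(y) / jac(y)  # contribution phi(y) / |f'(y)|
        return total
    return pushed

# Doubling map f(y) = 2y mod 1 on the circle: two inverse branches, |f'| = 2.
doubling = [
    (lambda x: x / 2.0, lambda y: 2.0),
    (lambda x: (x + 1.0) / 2.0, lambda y: 2.0),
]

phi = lambda y: 1.0 + math.cos(4.0 * math.pi * y)
p_phi = transfer_operator(phi, doubling)
```

For this particular choice, summing the two branches by hand gives $\mathcal{P}_f(\varphi)(x) = 1 + \cos(2\pi x)$, which `p_phi` reproduces pointwise.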

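In order to have a uniform rate of memory loss, we need to impose some bounds on the set of mappings
to be composed. For $\lambda \ge 1$ and $\Gamma \ge 0$, define

$$\mathcal{E}(\lambda,\Gamma) := \{f : M \to M : \|f\|_{C^2} \le \Gamma,\ |Df(x)v| \ge \lambda|v| \ \ \forall (x,v)\}$$

and let

$$\mathcal{D} := \{\varphi > 0 : \int \varphi\, dm = 1,\ \varphi \text{ is Lipschitz}\}.$$

Theorem 1. Given $\lambda$ and $\Gamma$ with $\lambda > 1$, there exists a constant $\Lambda = \Lambda(\lambda,\Gamma) \in (0,1)$ such that for any
sequence $f_i \in \mathcal{E}(\lambda,\Gamma)$ and any $\varphi,\psi \in \mathcal{D}$, there exists $C_{(\varphi,\psi)}$ such that

(2.1) $$\int |\mathcal{P}_{F_n}(\varphi) - \mathcal{P}_{F_n}(\psi)|\, dm \le C_{(\varphi,\psi)} \Lambda^n \qquad \forall n \ge 0.$$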

Remark 2.1. We have assumed in Theorem 1 that all of the $f_i$ are in a single $\mathcal{E}(\lambda,\Gamma)$. It will become clear
that more general results in which $\lambda$ and $\Gamma$ are allowed to vary with $i$ can be formulated and proved by
concatenating the arguments below.

Remark 2.2. Correlation decay for expanding maps has been studied before. For a single map, see e.g. [19,
21]. For random compositions, see e.g. [1, 2]. For time-dependent maps, [12] proves that $\int |\mathcal{P}_{F_n}(\varphi) - \mathcal{P}_{F_n}(\psi)|\, dm \to 0$
as $n \to \infty$ without discussing the rate of convergence.

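2.2. Outline of proof. Let $\varepsilon > 0$ be a small number to be determined, and fix $\lambda_0 > 1$ so that for all
$f \in \mathcal{E} = \mathcal{E}(\lambda,\Gamma)$, we have $d(fx, fy) \ge \lambda_0\, d(x,y)$ whenever $d(x,y) < \varepsilon$. Here $d(\cdot,\cdot)$ denotes Riemannian
distance. For $L > 0$, we define

$$\mathcal{D}_L := \left\{\varphi > 0 : \int \varphi\, dm = 1,\ \left|\frac{\varphi(x)}{\varphi(y)} - 1\right| \le L\, d(x,y) \ \text{ if } d(x,y) < \varepsilon \right\}.$$

Notice that $\mathcal{D} = \bigcup_{L>0} \mathcal{D}_L$: For $\varphi \in \mathcal{D}$,

$$\left|\frac{\varphi(x)}{\varphi(y)} - 1\right| = \frac{1}{\varphi(y)}\,|\varphi(x) - \varphi(y)| \le \frac{\mathrm{Lip}(\varphi)}{\min(\varphi)}\, d(x,y);$$

functions in $\mathcal{D}_L$ are clearly locally Lipschitz. Key to the proof is the following observation: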

Proposition 2.3. There exists $L^* > 0$ for which the following holds. For any $L > 0$, there exists $\tau(L) \in \mathbb{Z}^+$
such that for all $\varphi \in \mathcal{D}_L$ and $f_i \in \mathcal{E}$, $\mathcal{P}_{F_n}(\varphi) \in \mathcal{D}_{L^*}$ for all $n \ge \tau(L)$.

As our proof in Section 2.3 will show, the choice of $L^*$ is arbitrary, provided it is greater than a number
determined by $\lambda$ and $\Gamma$.

Now let $f_i \in \mathcal{E}$ and $\varphi,\psi \in \mathcal{D}$ be given. Then there exists $N_0 = N_0(\varphi,\psi)$ such that both $\mathcal{P}_{F_{N_0}}(\varphi)$ and
$\mathcal{P}_{F_{N_0}}(\psi)$ are in $\mathcal{D}_{L^*}$. This waiting period is the reason for the prefactor $C_{(\varphi,\psi)}$ on the right side of (2.1).
With this out of the way, we may assume we start with two densities $\varphi, \psi \in \mathcal{D}_{L^*}$ from here on.

Notice that all functions in $\mathcal{D}_{L^*}$ are $\ge \kappa$ for some constant $\kappa > 0$; it is easy to see from the definition of $\mathcal{D}_{L^*}$
that they have uniform lower bounds on $\varepsilon$-disks. We think of the measures $\varphi\, dm$ and $\psi\, dm$ as having a part,
namely $\kappa\, dm$, in common. Since $(F_n)_*(\kappa\, dm)$ will also be common to both $(F_n)_*(\varphi\, dm)$ and $(F_n)_*(\psi\, dm)$,
we regard this part of the two measures as having been "matched". In order to retain control of distortion
bounds, however, we will "match" only half of what is permitted, and renormalize the "unmatched part" as
follows: Let

(2.2) $$\tilde\varphi = \frac{\varphi - \frac12\kappa}{1 - \frac12\kappa\, m(M)} \quad\text{and}\quad \tilde\psi = \frac{\psi - \frac12\kappa}{1 - \frac12\kappa\, m(M)}.$$

Lemma 2.4. For $\varphi \in \mathcal{D}_{L^*}$, if $\tilde\varphi$ is as above, then $\tilde\varphi \in \mathcal{D}_{2L^*}$.
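
Let $N = \tau(2L^*)$ be given by Proposition 2.3. Then $\varphi_N := \mathcal{P}_{F_N}(\tilde\varphi)$ and $\psi_N := \mathcal{P}_{F_N}(\tilde\psi)$ are in $\mathcal{D}_{L^*}$. We
subtract off $\frac12\kappa$ from each of $\varphi_N$ and $\psi_N$ and renormalize as in (2.2), obtaining $\tilde\varphi_N$ and $\tilde\psi_N$ respectively. By
Lemma 2.4, they are in $\mathcal{D}_{2L^*}$. In general, given $\tilde\varphi_{(k-1)N}, \tilde\psi_{(k-1)N} \in \mathcal{D}_{2L^*}$, we let

$$\varphi_{kN} := \mathcal{P}_{F_{kN,(k-1)N+1}}(\tilde\varphi_{(k-1)N}) \quad\text{and}\quad \psi_{kN} := \mathcal{P}_{F_{kN,(k-1)N+1}}(\tilde\psi_{(k-1)N}).$$

By Proposition 2.3, $\varphi_{kN}, \psi_{kN} \in \mathcal{D}_{L^*}$. We subtract off $\frac12\kappa$ and renormalize to obtain $\tilde\varphi_{kN}$ and $\tilde\psi_{kN}$ in $\mathcal{D}_{2L^*}$
(Lemma 2.4), completing the induction.

Since a fraction of $\frac12\kappa \cdot m(M)$ of the not-yet-matched parts of the measures is matched every $N$ steps, we
obtain

$$\int |\mathcal{P}_{F_n}(\varphi) - \mathcal{P}_{F_n}(\psi)|\, dm \le 2\left(1 - \tfrac12\kappa \cdot m(M)\right)^k \quad\text{for } kN \le n < (k+1)N.$$

This leads directly to the asserted exponential estimate. ■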
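
The passage from the block-wise bound to an estimate of the form $C\Lambda^n$ can be spelled out in a short sketch. Here `matched_fraction` stands for the fraction $\frac12\kappa \cdot m(M)$ and `block_length` for $N$; the numerical values and the particular constants $C = 2/(1-q)$, $\Lambda = (1-q)^{1/N}$ are one possible choice made for this illustration, not constants asserted by the text.

```python
def coupling_bound(n, matched_fraction, block_length):
    """Bound 2 * (1 - q)^k valid for k*N <= n < (k+1)*N, with q = matched_fraction."""
    k = n // block_length
    return 2.0 * (1.0 - matched_fraction) ** k

def exponential_constants(matched_fraction, block_length):
    """One choice of (C, Lam) with coupling_bound(n) <= C * Lam**n for all n >= 0."""
    lam = (1.0 - matched_fraction) ** (1.0 / block_length)
    c = 2.0 / (1.0 - matched_fraction)
    return c, lam

q, N = 0.1, 5  # hypothetical values of (1/2)*kappa*m(M) and N
C, Lam = exponential_constants(q, N)
bounds = [coupling_bound(n, q, N) for n in range(50)]
```

The choice works because $k = \lfloor n/N \rfloor > n/N - 1$, so $(1-q)^k \le (1-q)^{-1}\,\Lambda^n$.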

Remark 2.5. Theorem 1 also holds for initial densities that are not strictly positive provided one is able to
guarantee that they eventually evolve into densities that are strictly positive. One way to make this happen
is to have sufficiently many of the initial $f_i$ remain in a small enough neighborhood of some fixed $f \in \mathcal{E}$, and
take advantage of the fact that every expanding map $f$ has the property that given any open set $U \subset M$,
there exists $N(U) \in \mathbb{N}$ such that $f^n(U) \supset M$ for all $n \ge N(U)$.

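2.3. Details of proof. We begin with an essential distortion estimate.

Lemma 2.6. There exists a constant $C_0$ depending on $\lambda_0$ and $\Gamma$ such that

$$\frac{|\det DF_n(x)|}{|\det DF_n(y)|} \le e^{C_0\, d(F_n(x), F_n(y))}$$

for all $x, y \in M$ and $n \in \mathbb{Z}^+$ with the property that $d(F_k(x), F_k(y)) < \varepsilon$ for all $k \le n$.

Proof of Lemma 2.6. With $F_0$ denoting the identity map, we have

$$\log \frac{|\det DF_n(x)|}{|\det DF_n(y)|} = \sum_{k=1}^{n} \Big(\log|\det Df_k(F_{k-1}(x))| - \log|\det Df_k(F_{k-1}(y))|\Big)$$

$$\le \sum_{k=0}^{n-1} C_1\, d(F_k(x), F_k(y)) \le \sum_{k=0}^{n-1} C_1 \lambda_0^{-(n-k)}\, d(F_n(x), F_n(y))$$

$$\le C_0\, d(F_n(x), F_n(y)),$$

where $C_1$ is an upper bound on the Lipschitz constant of the $C^1$ function $\log|\det Df|$ for any function $f$ in
the family $\mathcal{E}$, and one may take $C_0 = C_1/(\lambda_0 - 1)$. ■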

We are in position to prove Proposition 2.3, which asserts the existence of $L^* > 0$ such that $\mathcal{D}_{L^*}$ attracts
densities.

Proof of Proposition 2.3. Let $y \in D(x,\varepsilon)$ where $D(x,\varepsilon)$ is the disk of radius $\varepsilon$ centered at $x$. We let $G_{n,i}$ be
the $i$th branch of $F_n^{-1}|D(x,\varepsilon)$, and let

$$\varphi_n^i := \frac{\varphi \circ G_{n,i}}{|\det DF_n \circ G_{n,i}|}.$$

Then $\varphi_n^i$ is the contribution to the density $\varphi_n := \mathcal{P}_{F_n}(\varphi) = \sum_i \varphi_n^i$ obtained by pushing along the $i$th branch.
Estimating distortion one branch at a time, we have

$$\frac{\varphi_n^i(x)}{\varphi_n^i(y)} = \left(\frac{\varphi(G_{n,i}(x))}{\varphi(G_{n,i}(y))}\right) \left(\frac{|\det DF_n(G_{n,i}(y))|}{|\det DF_n(G_{n,i}(x))|}\right).$$

To estimate the first factor on the right, we use $d(G_{n,i}(x), G_{n,i}(y)) \le \lambda_0^{-n}\, d(x,y)$ and $\varphi \in \mathcal{D}_L$. To estimate
the second factor, we use Lemma 2.6. Combining the two, we obtain

$$\log \frac{\varphi_n^i(x)}{\varphi_n^i(y)} \le (L\lambda_0^{-n} + C_0)\, d(x,y).$$

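Exponentiating, moving $\varphi_n^i(y)$ to the right side, and summing over $i$ before dividing by $\varphi_n(y)$ again, we obtain

(2.3) $$\frac{\varphi_n(x)}{\varphi_n(y)} \le e^{(L\lambda_0^{-n} + C_0)\, d(x,y)}.$$

By taking $\varepsilon$ small enough, we may assume

$$\left|\frac{\varphi_n(x)}{\varphi_n(y)} - 1\right| \le 2 \left|\log \frac{\varphi_n(x)}{\varphi_n(y)}\right|.$$

Finally, we choose $\tau(L)$ large enough so that $L\lambda_0^{-\tau(L)} \le C_0$, and conclude that

$$\left|\frac{\varphi_n(x)}{\varphi_n(y)} - 1\right| \le L^*\, d(x,y)$$

for all $n \ge \tau(L)$, where $L^* = 4C_0$. ■

Only the proof of Lemma 2.4 remains.

Proof of Lemma 2.4. The distortion of $\tilde\varphi$ satisfies

$$\left|\frac{\tilde\varphi(x)}{\tilde\varphi(y)} - 1\right| = \left|\frac{\varphi(x) - \frac12\kappa}{\varphi(y) - \frac12\kappa} - 1\right| = \left|\frac{\varphi(x)}{\varphi(y)} - 1\right| \cdot \frac{1}{1 - \frac{\kappa}{2\varphi(y)}}.$$

Since $\varphi \ge \kappa$, the rightmost quantity above is $\le 2$. We conclude that $\tilde\varphi \in \mathcal{D}_{2L^*}$ if $\varphi \in \mathcal{D}_{L^*}$. ■

The proof of Theorem 1 is now complete.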



3. Time-dependent 1D piecewise expanding maps

3.1. Statement of results. We consider in this section piecewise $C^2$ expanding maps of the circle. More
precisely, we let $S^1$ be the interval $[0,1]$ with end points identified, and say $f : S^1 \to S^1$ is piecewise $C^2$
expanding if there exists a finite partition $\mathcal{A}_1 = \mathcal{A}_1(f)$ of $S^1$ into intervals such that for every $I \in \mathcal{A}_1$,

(1) $f|I$ extends to a $C^2$ mapping in a neighborhood of $I$;

(2) there exists $\lambda > 1$ such that $|f'(x)| \ge \lambda$ for all $x \in I$.

It simplifies the analysis slightly to assume $\lambda > 2$, and we will do that (if $\lambda \le 2$, we replace $f$ by a suitable
power of $f$ and adjust the assumptions below accordingly).

Unlike the case of expanding maps (with no discontinuities), compositions of piecewise expanding maps
do not necessarily have exponential loss of memory. Indeed, systems defined by a single piecewise expanding
map may not even be ergodic, and decay of correlations (loss of memory) in that context is equivalent
to mixing. Some additional conditions are therefore needed for results along the lines of Theorem 1. Let
$\mathcal{A}_n := \bigvee_{i=1}^{n} f^{-(i-1)}\mathcal{A}_1$ be the join of the pullbacks of the partition $\mathcal{A}_1$ and let $\mathcal{A}_n|I$ be the restriction of $\mathcal{A}_n$
to the set $I$. For $J \subset S^1$, let $\mathrm{int}(J)$ denote the interior of $J$.

Definition 3.1. We say $f$ is enveloping if there exists $N \in \mathbb{Z}^+$ such that for every $I \in \mathcal{A}_1$, we have

$$\bigcup_{J \in \mathcal{A}_N|I} f^N(\mathrm{int}(J)) = S^1.$$

The smallest such $N$ is called the enveloping time.

If the enveloping time of $f$ is $N$, then starting from any $I \in \mathcal{A}_1$, $f^N|I$ overcovers $S^1$, in the sense that
every $z \in S^1$ lies in $f^N(J)$ for some $J \in \mathcal{A}_N|I$, and more than that: it is a positive distance from $f^N(\partial J)$.
From here on, our universe $\mathcal{E}$ is comprised of piecewise $C^2$ expanding, enveloping maps.

For the same reason that many (individual) piecewise expanding maps are not mixing, one cannot expect 
the arbitrary composition of piecewise expanding maps to produce exponential loss of memory - even when 
the constituent maps have good mixing properties: this is because such properties do not necessarily manifest 
themselves in a single step. To effectively leverage the mixing properties of individual maps, we may need a
number of consecutive $f_i$ to be near a single map. We will formulate two sets of results: a local result, which
assumes that all the $f_i$ are near a single piecewise expanding map $g$, and a global result, which allows the $f_i$
to wander far and wide but slowly.

3.1.1. Local result. Let $g \in \mathcal{E}$ be fixed. We let $\Omega(g) = \{x_1 = x_{k+1}, x_2, \ldots, x_k\} \subset S^1$ be the set of discontinuity
points of $g$ labeled counterclockwise, and let $d_\Omega(g) := \min_i |x_{i+1} - x_i|$. For $\varepsilon < \frac14 d_\Omega(g)$, we say $f \in \mathcal{E}$ is
$\varepsilon$-near $g$, written $f \in \mathcal{U}_\varepsilon(g)$, if the following hold:

(1) $\Omega(f) = \{y_1 = y_{k+1}, y_2, \ldots, y_k\}$ where $|y_i - x_i| < \varepsilon$;

(2) if $\xi_f$ maps each interval $[x_i, x_{i+1}]$ affinely onto $[y_i, y_{i+1}]$, then on each $[x_i, x_{i+1}]$,

$$\|f \circ \xi_f - g\|_{C^2} < \varepsilon.$$

As in the case of single 1D piecewise expanding maps, a natural class of densities to consider is

$$\mathcal{D} = \left\{\varphi \in \mathrm{BV}(S^1,\mathbb{R}) : \varphi \ge 0,\ \int \varphi(x)\, dx = 1\right\}.$$

Recall the definitions of $F_n$ and $\mathcal{P}_{F_n}$ from the end of Section 1 and the beginning of Section 2, respectively.

Theorem 2. Let $g \in \mathcal{E}$. There exist $\Lambda < 1$ and $\varepsilon > 0$ sufficiently small (depending on $g$) such that for all
$f_i \in \mathcal{U}_\varepsilon(g)$ and $\varphi, \psi \in \mathcal{D}$, there exists $C_{(\varphi,\psi)} > 0$ such that for all $n \in \mathbb{Z}^+$, we have

(3.1) $$\int_{S^1} |\mathcal{P}_{F_n}(\varphi) - \mathcal{P}_{F_n}(\psi)|\, dx \le C_{(\varphi,\psi)} \Lambda^n.$$

There exists an extensive literature on correlation decay for 1D piecewise expanding maps in the contexts
of a single map and random i.i.d. compositions. See, e.g., [1, 2, 9, 16].

3.1.2. Global result. It is straightforward to verify that the collection of sets $\mathcal{S} := \{\mathcal{U}_\varepsilon(f) : f \in \mathcal{E},\ \varepsilon <
\frac14 d_\Omega(f)\}$ generates a topology on $\mathcal{E}$.¹ Consider now a continuous map $\gamma : [a,b] \to \mathcal{E}$ (see Figure 1) and a finite
or infinite sequence of $f_i$ of the form $f_i = \gamma(t_i)$ where $a \le t_1 \le t_2 \le t_3 \le \cdots \le b$. Let $\Delta := \max_i (t_{i+1} - t_i)$.
If we think of the closed interval $[a,b]$ as time, then decreasing $\Delta$ corresponds to decreasing the 'velocity' at
which the curve $\gamma([a,b])$ is traversed.




Figure 1. The picture we envision is that of "driving" the system along a curve $\gamma$ in
$\mathcal{E}$ and losing memory of past density distributions at variable rates depending on local
characteristics. That, we submit, is the true nature of memory loss in dynamical systems 
with slowly varying parameters. 



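Theorem 3. Let $\gamma : [a,b] \to \mathcal{E}$ be a continuous map. Then there exist $\delta_0 > 0$ and $\Lambda < 1$ (depending on $\gamma$)
for which the following holds: For every $\{t_i\}$ as above with $\Delta \le \delta_0$ and $\varphi, \psi \in \mathcal{D}$, there exists $C_{(\varphi,\psi)} > 0$
such that for all relevant $n$,

$$\int_{S^1} |\mathcal{P}_{F_n}(\varphi) - \mathcal{P}_{F_n}(\psi)|\, dx \le C_{(\varphi,\psi)} \Lambda^n.$$

¹ To prove $\mathcal{S}$ forms the basis of a topology, it suffices to check that for $f_1, f_2 \in \mathcal{E}$, $\varepsilon_1, \varepsilon_2 > 0$, and $g \in \mathcal{U}_{\varepsilon_1}(f_1) \cap \mathcal{U}_{\varepsilon_2}(f_2)$, there
exists $\varepsilon > 0$ such that $\mathcal{U}_\varepsilon(g) \subset \mathcal{U}_{\varepsilon_1}(f_1) \cap \mathcal{U}_{\varepsilon_2}(f_2)$.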

Remark 3.2. We have tried not to overburden the formulation of Theorem 3, but as will be clear from
the proofs, various generalizations are possible: The curve can be defined on an infinite interval and can
traverse various subregions of $\mathcal{E}$ with nonuniform derivative bounds, leading to variable rates of memory
loss. One does not, in fact, have to start with a prespecified curve, and occasional long distance jumps can
be accommodated.

Remark 3.3. Finally, we note that Theorem 3 - together with its generalizations mentioned in Remark 3.2 
- is a simplified version of the Lorentz gas example in Section 1, an important difference being the absence 
of the stable directions. 

3.2. Proof of local result. The following is an outline of the main steps of our proof: 

Step 1. As in the expanding case (in Section 2), we represent the set of densities $\mathcal{D}$ as $\mathcal{D} = \bigcup_a \mathcal{D}_a$ where the
conditions on $\mathcal{D}_a$ are more relaxed as $a$ increases, and show that there is an $a^*$ for which $\mathcal{D}_{a^*}$ is an attracting
set under $\mathcal{P}_{F_n}$ for any sequence of $f_i$ in a subset of $\mathcal{E}$ with uniform bounds. The time it takes to enter $\mathcal{D}_{a^*}$
from each $\mathcal{D}_a$ is shown to be bounded.

Step 2. Unlike the expanding case, where all functions in this attracting set are uniformly bounded away 
from 0, and coupling (or matching of densities) can be done immediately, we do not have such a bound here. 
Instead, we guarantee the matching of a fixed fraction of the measures a finite number of steps later using 
the enveloping property of g. 

Step 3. To complete the cycle, we must show that after subtracting off the amount matched and renormalizing
as in Section 2, functions in $\mathcal{D}_{a^*}$ are in $\mathcal{D}_a$ for some bounded $a$.

We now carry out these steps in detail. 
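Step 1. For $\varphi \in \mathrm{BV}(S^1,\mathbb{R})$, we let $\bigvee_0^1 \varphi$ denote the total variation of $\varphi$, and let

$$\mathcal{D}_a := \left\{\varphi \in \mathrm{BV}(S^1,\mathbb{R}) : \varphi \ge 0,\ \int \varphi = 1,\ \bigvee_0^1 \varphi \le a\right\}.$$

Clearly, $\mathcal{D} = \bigcup_a \mathcal{D}_a$. Let

$$\lambda(f) := \min_{I \in \mathcal{A}_1(f)} \inf_{x \in I} |f'(x)|,$$

and recall the following well-known inequality originally due to Lasota and Yorke.

Lemma 3.4 (Lasota-Yorke inequality [11]). Let $f$ be a piecewise $C^2$ expanding map. For $\varphi \in \mathrm{BV}(S^1,\mathbb{R})$,
we have

(3.2) $$\bigvee_0^1 \mathcal{P}_f(\varphi) \le 2\lambda(f)^{-1} \bigvee_0^1 \varphi + A(f)\,\|\varphi\|_1$$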

where 

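We now fix $\mathcal{E}_0 \subset \mathcal{E}$ with uniform $C^2$ bounds and with $g$ well inside $\mathcal{E}_0$. Let

$$\lambda_0 := \inf_{f \in \mathcal{E}_0} \lambda(f) \quad\text{and}\quad A_0 := \sup_{f \in \mathcal{E}_0} A(f).$$

We assume $\lambda_0 > 2$. Upon repeated applications of (3.2), for $f_i \in \mathcal{E}_0$ and $\varphi \in \mathcal{D}$ we obtain

(3.3) $$\bigvee_0^1 \mathcal{P}_{F_n}(\varphi) \le (2\lambda_0^{-1})^n \bigvee_0^1 \varphi + \frac{A_0}{1 - 2\lambda_0^{-1}},$$

which is the analog of the distortion estimate (2.3) in Section 2.
Our main result in Step 1 is

Proposition 3.5. Fix any $a^* > \frac{A_0}{1 - 2\lambda_0^{-1}}$. Then for every $a > 0$, there exists $\tau(a) \in \mathbb{Z}^+$ such that for all
$f_i \in \mathcal{E}_0$, $\varphi \in \mathcal{D}_a$ and $n \ge \tau(a)$, $\mathcal{P}_{F_n}(\varphi) \in \mathcal{D}_{a^*}$.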



Proof. This is an immediate consequence of (3.3). In fact, it is enough to choose

(3.4) $$\tau(a) \ge \ln\left(\left(a^* - \frac{A_0}{1 - 2\lambda_0^{-1}}\right) a^{-1}\right) \Big/ \ln(2\lambda_0^{-1})$$

among nonnegative integers. ■
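
The choice of $\tau(a)$ in (3.4) can be checked numerically against the bound (3.3). In the sketch below, the function names `variation_bound` and `entry_time` and the numerical values of $\lambda_0$, $A_0$, $a$ and $a^*$ are hypothetical and serve only to illustrate the arithmetic.

```python
import math

def variation_bound(n, a, lam0, A0):
    """Right side of (3.3): bound on the variation after n steps, starting from variation a."""
    return (2.0 / lam0) ** n * a + A0 / (1.0 - 2.0 / lam0)

def entry_time(a, a_star, lam0, A0):
    """Smallest nonnegative integer satisfying (3.4)."""
    limit = A0 / (1.0 - 2.0 / lam0)
    if not (lam0 > 2.0 and a_star > limit and a > 0.0):
        raise ValueError("need lam0 > 2, a_star > A0 / (1 - 2/lam0), a > 0")
    # Dividing by ln(2/lam0) < 0 turns the upper bound on (2/lam0)^n into a lower bound on n.
    raw = math.log((a_star - limit) / a) / math.log(2.0 / lam0)
    return max(0, math.ceil(raw))

lam0, A0 = 3.0, 1.0   # hypothetical constants, so A0 / (1 - 2/lam0) is about 3
a_star = 4.0
tau = entry_time(50.0, a_star, lam0, A0)
```

With these values the bound (3.3) first drops below $a^*$ at step `tau`, and any density whose variation is already small enough needs no waiting time at all.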



Step 2. The second step is perturbative. We will first work with iterates of $g$ before extending our results
to $f_i$ in some suitable $\mathcal{U}_\varepsilon(g)$.

Lemma 3.6. There exist $n_0 \in \mathbb{Z}^+$ and $\kappa_0 > 0$ (depending on $g$) such that for all $\varphi \in \mathcal{D}_{a^*}$, $\mathcal{P}_{g^{n_0}}(\varphi) \ge \kappa_0$.

Proof. Let $\mathcal{A}_1$ be the partition for $g$, and let $n_1$ be such that all elements of $\mathcal{A}_{n_1}$ have length $< \frac{1}{2a^*}$. We
will show that for every $\varphi \in \mathcal{D}_{a^*}$ there exists $J = J(\varphi) \in \mathcal{A}_{n_1}$ such that $\varphi|J \ge \frac12$. Suppose, to derive a
contradiction, that for each $J \in \mathcal{A}_{n_1}$, there exists $z_J \in J$ with $\varphi(z_J) < \frac12$. Then

$$\int_J \varphi \le |J| \left(\varphi(z_J) + \bigvee_J \varphi\right) < \frac12 |J| + \frac{1}{2a^*} \bigvee_J \varphi.$$

Summing over $J$, this gives $\int_{S^1} \varphi < \frac12 + \frac12 = 1$.

Next we claim that for every $J \in \mathcal{A}_{n_1}(g)$, there exists $s = s(J)$ and a subinterval $J_s \subset J$ such that $g^s|J_s$
is $C^2$ and $g^s(J_s) \supset I$ for some $I \in \mathcal{A}_1(g)$. To prove this, we inductively define a nested sequence of intervals
$J = J_1 \supset J_2 \supset J_3 \supset \cdots$ as follows. Assume that $J_k$ has been defined. If $g^k(J_k) \supset I$ for some $I \in \mathcal{A}_1(g)$, set
$s = k$. If not, then either $g^k(J_k) \subset I$ for some $I \in \mathcal{A}_1(g)$ or $g^k(J_k)$ intersects 2 elements of $\mathcal{A}_1(g)$. In the
former case, set $J_{k+1} = J_k$, and in the latter, let $J_{k+1} \subset J_k$ be the subinterval whose image $g^k(J_{k+1})$ is the longer of
the 2 intervals in $\mathcal{A}_1(g)|g^k(J_k)$. This process must terminate in a finite number of steps because $\inf |g'| > 2$.

Let $n_0 := s_0 + N$ where $s_0 = \max\{s(J) : J \in \mathcal{A}_{n_1}\}$ and $N = N(g)$ is the enveloping time for $g$. We
now produce the $\kappa_0$ with the asserted property in the lemma. Fix arbitrary $\varphi \in \mathcal{D}_{a^*}$. Let $J = J(\varphi) \in \mathcal{A}_{n_1}$
be such that $\varphi|J \ge \frac12$, and let $I \in \mathcal{A}_1$ be such that $g^{s(J)}(J_s) \supset I$. Then $\mathcal{P}_{g^{s(J)}}(\varphi)|I \ge \frac12 M^{-s(J)}$ where
$M = M(g) := \sup |g'|$. From $g^N(I) = S^1$, it follows that $\mathcal{P}_{g^{s(J)+N}}(\varphi) \ge \frac12 M^{-(s(J)+N)}$ on $S^1$. We still have
some steps to go if $s(J) < s_0$, but $g$ is onto (as all enveloping maps are necessarily onto), and even in the
worst-case scenario, we still have $\mathcal{P}_{g^{n_0}}(\varphi) \ge \frac12 M^{-n_0} =: \kappa_0$ everywhere on $S^1$. ■

Define

$$\mathcal{A}(F_n) := \bigvee_{i=1}^{n} (F_{i-1})^{-1} \mathcal{A}_1(f_i)$$

where $F_0$ is the identity map. Now let $f_i \in \mathcal{U}_\varepsilon(g)$. From the one-to-one correspondence between elements
of $\mathcal{A}_1(f_i)$ and $\mathcal{A}_1(g)$, one deduces that provided $\varepsilon$ is sufficiently small, there is a well-defined mapping
$\Phi_n : \mathcal{A}_n(g) \to \mathcal{A}(F_n)$ where for $J \in \mathcal{A}_n(g)$, $\Phi_n(J) \in \mathcal{A}(F_n)$ has the same itinerary as $J$. (In general, $\Phi_n$
need not be onto.) For $J = (a,b)$, let $J_\delta := (a + \delta, b - \delta)$.

Lemma 3.7. Let $n_0$ be as in Lemma 3.6. Then there exist $\varepsilon > 0$ with $\mathcal{U}_\varepsilon(g) \subset \mathcal{E}_0$ and $\kappa > 0$ such that for
all $f_i \in \mathcal{U}_\varepsilon(g)$, $\mathcal{P}_{F_{n_0}}(\varphi) \ge \kappa$ for all $\varphi \in \mathcal{D}_{a^*}$.

Proof of Lemma 3.7. Let $\varphi \in \mathcal{D}_{a^*}$ be fixed. In the argument below, $\varepsilon > 0$ and $\delta > 0$ will be taken to be as
small as is needed ($\varepsilon$ and $\delta$ depend on $g$ and on $a^*$ but not on $\varphi$). We let $n_1$, $J = J(\varphi) \in \mathcal{A}_{n_1}(g)$, $s(J) \in \mathbb{Z}^+$,
and $I \in \mathcal{A}_1(g)$ be as in the proof of Lemma 3.6. In particular, $\varepsilon$ is small enough (depending on $g$ and $n_1$)
so that $\Phi_n : \mathcal{A}_n(g) \to \mathcal{A}(F_n)$ is well defined for all $f_i \in \mathcal{U}_\varepsilon(g)$ and the following 2 values of $n$: $n = n_1$ and
$n = N$, where $N = N(g)$ is the enveloping time for $g$.

We claim that for every $I \in \mathcal{A}_1(g)$ and $f_i \in \mathcal{U}_\varepsilon(g)$, we may assume that $F_N(I_\delta) = S^1$. For each
$I' \in \mathcal{A}_N(g)|I$, $g^N(I')$ and $F_N(\Phi_N(I'))$ can be made arbitrarily close. This conclusion remains true if we
shrink $I'$ by a small amount, i.e., $\delta$ (we need only do this for the leftmost and rightmost $I' \in \mathcal{A}_N(g)|I$). The
assertion follows from this and the enveloping property of $g$.

Now let $f_i \in \mathcal{U}_\varepsilon(g)$ be fixed, and let $J' = \Phi_{n_1}(J)$. Assuming $\varepsilon$ is chosen sufficiently small, $J'' := J' \cap J \ne \emptyset$,
and $F_{s(J)}(J'') \supset I_\delta$ where $I_\delta$ is as in the previous paragraph. Thus $F_{s(J)+N}(J'') = S^1$, and since $F_{n_0,\, s(J)+N+1}$
is onto, it follows that $F_{n_0}(J'') = S^1$. Noting that $\varphi|J'' \ge \frac12$, we conclude that

$$\mathcal{P}_{F_{n_0}}(\varphi) \ge \tfrac12 (M + \varepsilon)^{-n_0} =: \kappa.$$

■

Step 3. The matching process introduces, for $\varphi \in \mathcal{D}_{a^*}$ with $\varphi \ge \kappa$, a new density

$$\tilde\varphi := \frac{\varphi - \kappa}{1 - \kappa}.$$

(We may subtract off any amount $\le \kappa$, the only requirement being that $\tilde\varphi$ remains $\ge 0$.) Since subtracting a
constant does not change the variation, and magnifying a function by a constant $c$ magnifies the variation by at most
$c$, it follows that $\tilde\varphi \in \mathcal{D}_{a^*(1-\kappa)^{-1}}$.

Proof of Theorem 2. We first iterate $\varphi$ and $\psi$ until $\mathcal{P}_{F_n}(\varphi) \in \mathcal{D}_{a^*}$ and $\mathcal{P}_{F_n}(\psi) \in \mathcal{D}_{a^*}$. This accounts for
the prefactor $C_{(\varphi,\psi)}$ in (3.1). We then follow the matching scheme in the proof of Theorem 1, obtaining

$$\Lambda = (1 - \kappa)^{\left(n_0 + \tau(a^*(1-\kappa)^{-1})\right)^{-1}}.$$ ■

3.3. Proof of global results. Since $\gamma([a,b])$ is compact, we may assume it lies in a subset $\mathcal{E}_0$ of $\mathcal{E}$ with
uniformly bounded derivatives and a minimum expansion $\lambda_0 > 2$ as in Section 3.2. This implies in particular
that the set $\mathcal{D}_{a^*}$ can be taken to be uniform for all $g \in \gamma([a,b])$.
For each $g \in \mathcal{E}_0$, there are three numbers that are relevant:

(1) $\varepsilon(g)$, which describes the size of the neighborhood in which our local results apply;

(2) $\kappa(g) > 0$ as given by Lemma 3.7;

(3) $n(g) := n_0(g) + \tau(a^*(1-\kappa)^{-1})$ where $n_0$ is as in Lemma 3.6 and involves the enveloping time of $g$.

These quantities depend not just on the derivatives of $g$ but on its geometry, i.e. how $\mathcal{A}_n(g)$ partitions $S^1$,
how quickly the covering property takes hold, and so on. Our local results imply that for all $f_i \in \mathcal{U}_{\varepsilon(g)}(g)$
and $\varphi, \psi \in \mathcal{D}_{a^*}$, $n(g)$ is the number of steps at the end of which we are guaranteed that the pair of densities
has been matched once, and that their unmatched parts, renormalized, are returned to $\mathcal{D}_{a^*}$. Moreover, the
amount matched is $\ge \kappa(g)$.

Proof of Theorem 3. For each $t \in [a,b]$, let $V_\alpha(t)$ denote the $\alpha$-neighborhood of $t$ in $\mathbb{R}$, and let $\alpha(t) > 0$ be
such that $\gamma(V_{\alpha(t)}(t) \cap [a,b]) \subset \mathcal{U}_{\varepsilon(\gamma(t))}(\gamma(t))$. By compactness, there exist $z_1 < z_2 < \cdots < z_Q$ such that
$\bigcup_j V_{\frac12\alpha(z_j)}(z_j)$ covers $[a,b]$. Let $g_j = \gamma(z_j)$, $V_j = V_{\alpha(z_j)}(z_j)$, and $\frac12 V_j = V_{\frac12\alpha(z_j)}(z_j)$. Define $\delta_0 := \min_j \frac{\alpha(z_j)}{2n(g_j)}$.

We claim that if $\{t_i\}$ defines a partition on $[a,b]$, the mesh $\Delta$ of which is $\le \delta_0$, then the $f_i = \gamma(t_i)$ will have
the desired properties. Consider $f_i$ for arbitrary $i$, and let $\varphi, \psi \in \mathcal{D}_{a^*}$. Since $t_i \in \frac12 V_j$ for some $j$, our choice
of $\delta_0$ assures that $f_i, \ldots, f_{i+n(g_j)-1} \in \mathcal{U}_{\varepsilon(g_j)}(g_j)$. Thus a matching will take place, and the process can
be repeated again at the end of $n(g_j)$ steps. Since $\max_j n(g_j) < \infty$ and $\min_j \kappa(g_j) > 0$, exponential loss of
memory is proved. ■

The argument above applies to $\gamma$ defined on a compact interval. If the curve in $\mathcal{E}$ is infinite, one simply
divides it up into suitably short segments and treats them one at a time (see Remark 3.2).